Escorted Free Energy Simulations: Improving Convergence by Reducing Dissipation 
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Nonequilibrium, "fast switching" estimates of equilibrium free energy differences, AF, are often 
plagued by poor convergence due to dissipation. We propose a method to improve these estimates 
by generating trajectories with reduced dissipation. Introducing an artificial flow field that couples 
the system coordinates to the external parameter driving the simulation, we derive an identity for 
AF in terms of the resulting trajectories. When the flow field effectively escorts the system along 
a near-equilibrium path, the free energy estimate converges efficiently and accurately. We illustrate 
our method on a model system, and discuss the general applicability of our approach. 



The estimation of free energy differences is a challeng- 
ing problem of central importance in computational ther- 
modynamics. The problem can be formulated as follows. 
Given two equilibrium states of a system of interest, at 
the same temperature f3~ 1 but different values of an ex- 
ternal parameter, A = A, B, how do we estimate the cor- 
responding free energy difference, AF = Fb — Fa? While 
the widely used thermodynamic integration and perturba- 
tion methods are based on equilibrium sampling, recently 
there has been interest in the use of nonequilibrium sim- 
ulations to estimate free energy differences |2| . In the 
most direct implementation of this approach, one repeat- 
edly simulates a thermodynamic process during which 
the parameter A is "switched" at a finite rate from A to 
B, with initial conditions sampled from equilibrium. AF 
is then estimated using the identity Q 



n=l 



(1) 



Here angular brackets denote an ensemble average over 
realizations of the process, W n is the work performed on 
the system during the n'th of iV such simulations, and 
the approximation becomes an equality as N — > oo. 

While Eq. |l| implies that we can determine AF using 
simulations of arbitrarily short duration ("fast switch- 
ing" [4n, wepay a penalty in the form of poor conver- 
gence [a, H, as the number of simulations needed to 
obtain a reliable free energy estimate using Eq. (JTJ) in- 
creases rapidly with the dissipated work, 



(W diss ) = (W) — AF > 0, 



(2) 



that accompanies fast switching simulations. This dissi- 
pation is a consequence of the second law of thermody- 
namics, and reflects the lag that develops as the system 
pursues - but is unable to keep pace with - the equi- 
librium state corresponding to the continually changing 
value of the work parameter, A [g, 0, Hoj] . We can di- 
minish the lag by running longer simulations, but this 
increases the computational cost per simulation. 

In this Letter we introduce a general strategy for im- 
proving the efficiency of fast switching free energy es- 
timates. In our approach, the "physical" equations of 



motion ordinarily used during a simulation are modified 
by the addition of an artificial flow field, u(z, A), that di- 
rectly couples the evolution of the system coordinates z to 
variations in the work parameter, A (Eq. [6|) . Our central 
result, Eq. (fT3|) . is an identity for AF in terms of tra- 
jectories generated with the modified dynamics. While 
this result is valid for an arbitrary, well-behaved flow field 
(reducing to Eq. (|TJ) when u = 0), the method is par- 
ticularly effective when this field is constructed so as to 
escort the system along a near-equilibrium path. In par- 
ticular, if u entirely eliminates the above-mentioned lag, 
then our method provides a perfect estimator of the free 
energy difference: W = AF for every simulation. 

Consider a classical system described by a Hamilto- 
nian H(z;X), or H\(z), where z specifies a point in d- 
dimensional phase space (or configuration space). At 
temperature /3 _1 , the equilibrium state of this system 
is described by the distribution 



p eq (z,A) = _exp[-/3F,(z,A)], 



(3) 



with free energy F\ = —(3 1 lnZ\. We are interested in 
the difference AF = F B - F A . 

We suppose that we have a preferred set of equations of 
motion for simulating the evolution of the system, which 
we write in the generic form 



z = v(z, A), 



(4) 



where z = dz/di, and v(z, A) typically contains both 
deterministic and stochastic terms. Examples include 
Hamilton's equations, Langevin dynamics, and the An- 
dersen and Nose- Hoover thermostats Q. (While we treat 
time as a continuous variable in this Letter, our method 
can be generalized to include discrete-time Monte Carlo 
dynamics Eq. ((4]) can be either stationary or ex- 



plicitly time-dependent, according to whether we hold A 
fixed or vary it with time. An ensemble of trajectories 
evolving under Eq. is described by a phase space 
density f{z,t) satisfying a Liouville-type equation, 



dt 



£a-/(M)- 



(5) 
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As in Refs. 12, TJJ, we assume C\ ■ e~^ Hx — 0, i.e. the 
equilibrium state is preserved when A is fixed. We will 
use the term physical dynamics to refer to the evolution 
described by Eq. (|TJ) (at the single-trajectory level) or 
Eq. ([5|) (at the ensemble level), to emphasize that these 
dynamics are intended to model, to some degree of real- 
ism, the microscopic evolution of our system of interest. 

Now suppose we modify Eq. (jTJ) by adding a term 
proportional to A = dX/dt: 



z = v + Au, 



(0) 



where u = u(z, A) is an arbitrary, continuous vector field 
on phase space. [3l[ With this additional, artificial term, 
every small increment of the work parameter, dA, induces 
a phase-space displacement, dz = u dA. Under these 
modified dynamics, the phase-space density satisfies 



df 
dt 



C x f - AV • (u/) 



(7) 



where the continuity term — AV • (u/) accounts for the 
flow Au. We now derive our central result, Eq. (fl3l). by 
generalizing the analysis of Hummer and Szabo [131 ] to 
include the A-dependent terms in Eqs. ([6]) and 
From Eq. ([7]), we have 



Aa 6 



-0H _ 



/?A[u.(V.ff)-/r 1 (V.u)]e-' Mr . (8) 



Now consider a specific protocol A t for varying the work 
parameter from Xq — A to X T — B, and consider the sink 
equation. 



(9) 



m =c ^ 9 - (3x Jx 9 > 

where we have introduced the compact notation 

^(z,A) = — +u-ViI-r 1 V-u. (10) 
Using Eq. (|SJ) we verify by inspection that the function 



0(M) 



1 

z2 



-0H(z,\ t 



(11) 



is a solution of Eq. ©. Independently, the Feynman- 
Kac theorem provides a path-integral solution of Eq. © 
HI 14 1 HI- Equating these two solutions, we get 



cxp [-0H(p, A t )] = (<5(z - z t ) cxp(-/3 Wt )) u , (12) 

where w t — f Q dt'X ($H/$X). Here, z t denotes a trajec- 
tory evolving under Eq. ^ as A is varied from A to B\ 
the integrand A (j)H/ (f)X is evaluated along this trajectory; 
and (• • • ) u indicates an average over an ensemble of such 
trajectories, with initial conditions sampled from equilib- 
rium. Setting t = t and integrating Eq. (|12j) over phase 
space, we obtain 



-f)AF 



(13) 



where 



W 



A^(z t ,A t )df 



(14) 



is interpreted as the work performed on a system evolving 
under Eq. ©. 

[We have derived Eq. (fT3"|) by equating two solutions 
of the sink equation (Eq. [5]): one obtained by inspec- 
tion (Eq. ITT]) . the other via path integration (right side 
of Eq. [T2]) . An alternative derivation proceeds by first 
defining g(z,t) = (S(z — z t ) exp(— /3wt)) u , then showing 
that this function satisfies Eq. ©.whose solution is in 
turn given by Eq. (fTTj) . See Refs. [12, [IB] for analogous 
derivations of Eq. (jTJ) .] 

Eq. (|13p implies we can estimate AF by taking the ex- 
ponential average of W (Eq. [TJJ, over trajectories evolv- 
ing under the modified dynamics (Eq. [6|) . This general- 
izes the usual fast switching method: we recover Eq. [TJ 
by choosing u = 0. Our approach also contains ele- 
ments of both the metric scaling 17[ and targeted per- 
turbation HI, strategies, reducing to a variant of the 
former in the case of linear flow fields, u = a(A) z, and to 
the latter in the limit of instantaneous switching, r — > 0. 
In that limit, the term v in Eq. ^ becomes negligible, 
and the trajectory evolves by integration along the flow 
field: dz\/dX = u(z\, A). 

While Eq. (fl~3|) is valid for any well-behaved flow field, 
the efficiency of the free energy estimate (the conver- 
gence of the exponential average) depends critically on 
the choice of u(z, A). The challenge then is to construct 
flow fields that render calculations using Eq. (| 13[) more 
efficient than those using Eq. (jTJ) . Since the typically 
poor convergence of Eq. (fT]) correlates with the lag that 
develops between the current state of the system (/) and 
the instantaneous equilibrium distribution (p cq ), it is rea- 
sonable to speculate that a flow field u which reduces this 
lag will improve the efficiency of the free energy estimate. 
To pursue this idea, let us imagine for a moment that 
we are able to construct a perfect flow field, u*, that 
eliminates the lag entirely. In this case the distribution 
/(z, t) = p eq (z, A t ) is a solution of Eq. ||7J). Substituting 
this solution into Eq. ([7]), we get, using C\ ■ p cq = 0, 



dp c 



ox 



+ V • (u*p' 



0. 



(15) 



Setting p eq = e^ F H \ we obtain 



V-u-.f, (16) 



therefore 



W = 



\§dt= [ T X§dt = AF (17) 
o @X J dA 



for every trajectory z t . Thus, for a perfect flow field u*, 
there is no dissipation (Wdiss = 0) and a single trajectory 
provides the correct free energy difference. 
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Although on general grounds we expect that a perfect 
flow field typically exists, [32j it seems unlikely we will be 
able to solve for u* analytically, apart from a few simple 
systems. (Indeed, Eq (fT5|) suggests that an expression 
for dF/dX is required to obtain u*.) However, by reveal- 
ing that elimination of the lag results in a zero-variance 
estimator of AF, Eq. (fTF]) supports our earlier specu- 
lation: if we can construct a flow field that reduces the 
lag, then we should expect improved convergence of the 
exponential average. We now illustrate this idea. 

Consider Sun's one-dimensional model system (20l |. 



H(p,q,\) = ^-+q 4 
2m 



-16(1-A)(7 2 



P 

2m 



■V(q,X). (18) 
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For A = 0<A<1 = F, the potential energy pro- 
file V(q, A) is a double well, with minima at ±gn(A) = 
±iV8(l — A) separated by a barrier of height 64(1 — A) 2 
(Fig. [TJ. Setting (3 — 1, the equilibrium distribution 
is bimodal and sharply peaked around ±go; as A — > 1 
the two peaks coalesce as V becomes a single, quartic 
well. Analytical evaluation of the partition functions 
gives AF = Fb — Fa = 62.94... pjj. 

The direct application of Eq.[T]to this model gives poor 
results when the switching is performed rapidly [ljj [2(| • 
A typical simulation begins with the system near ±gn(0); 
then, as A is varied from to 1, the two minima at ±gn(A) 
approach one another, but the system lags behind, result- 
ing in large dissipation and poor free energy estimates. 
This is illustrated by the open circles in Fig. [2j obtained 
from simulations during which the system evolved under 
Hamilton's equations, integrated using the velocity Ver- 
let algorithm. Only for r = 1 does Eq. (fTJ) provide an 
accurate estimate of AF. (The systematic error evident 
in Fig. [2] arises after taking the logarithm of both sides 
of Eq. O (ll|.) 



300 



200 



> 



100 



-100 




FIG. 1: The potential energy landscape for A = (solid line) 
and A = 1 (dashed line). Also depicted are the equilibrium 
distribution and the flow field, at A = 0. 



FIG. 2: Comparison of estimates of AF using Eqs. |T} and 
(|13[) . We performed simulations for switching times ranging 
from t = 0.01 to r = 1.0. Each AF^ was obtained using 10 6 
trajectories, evolving under either Eq. (open circles) or 
Eqs. ((SJl, (|19[1 (filled circles). Error bars were computed using 
the bootstrap method; for the filled circles these were smaller 
than the symbols, and are not shown. 



To illustrate the application of Eq. (jT3|) , let us take 



u(q, A) = — tanh [64(1 - X)q q\ 



(19) 



with qg = go (A) as given above. This field acts only 
on the coordinate q, and not on the momentum p. We 
arrived at Eq. (flU)) by using crude approximations to 
estimate the solution of Eq. (|16[) . modeling p cq as a pair 
of Gaussians. Omitting the details of this calculation, 
we note that near either peak of p eq , u(q, A) displaces 
the system toward the origin at a speed A|u| ~ Adqn/dA 
(see Fig. [T]). This is the speed at which the two minima 
of V(q, A) approach the origin. Intuitively, we expect this 
flow to reduce the lag between / and p eq . 

We repeated the simulations described above, now 
adding the term Au to the dynamics. The resulting 
estimates of AF, obtained using Eq. (fTB"]) and de- 
picted as filled circles in Fig. [21 are remarkably accu- 
rate over the entire range of switching times. Indeed, for 
all r = 0.01, • ■ ■ , 1.0, the work values W were sharply 
peaked around AF (data not shown), confirming that 
the flow field escorts the system through a sequence of 
near-equilibrium states, even when A is switched rapidly. 
We stress, however, that this choice of flow field is neither 
perfect (a / «*) nor unique. In particular, we expect it 
could be improved near A = 1, where the approximations 
made on the way to Eq. (fT9|) break down. 

As illustrated by this simple, proof-of-principlc exam- 
ple, the key to success with our method is a flow field 
u that reduces lag, and therefore dissipation, by mim- 
icking the effect of a variation of A on the distribution 
p cq . In general, such flow fields might be constructed 
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using physical insight, experience and prior knowledge 
of the system, perhaps with iterative adjustment to im- 
prove convergence. [33| We expect that, with trial and 
error, flow fields appropriate to a variety of problems will 
be developed. For instance, (f ) particle insertion into a 
fluid, (2) cavity growth in a fluid, and (3) the charging of 
a solute, all represent in silico thermodynamic processes 
for which we have some intuition regarding the atomic 
rearrangements that accompany the pro cess. Prelimi- 
nary calculations, reported elsewhere ll|, confirm that 
such intuition can be used to design flow fields that sig- 
nificantly speed up the estimation of AF. Our method 
might also be combined with steered molecular dynam- 
ics [H, HI], in which a constraining potential is used to 
drag a coordinate £ along a desired path £ t . By adding 
a flow field that acts on this coordinate and others cou- 
pled to it, one might be able to reduce the lag between 
£ and £t. For free energy calculations along a reaction 
path for which we do not have good intuition, transition 
path sampling [2~3 | could provide information useful for 
designing an effective flow field. 

The method we propose is distinct from path-space 



sampling schemes [20|, |25|, |26j, |27J , in which the conver- 
gence of Eq. |T|) is improved by modifying the probabil- 
ities with which physical trajectories are generated, for 
instance by biasing in favor of small work values. In our 
approach, by contrast, we modify the equations of motion 
themselves, thereby sampling from an entirely different 
set of trajectories. (Thus in the above example, we gener- 
ated non-Hamiltonian trajectories, rather than a biased 
sampling of Hamiltonian trajectories.) The distinction is 
particularly evident in the case of a perfect flow field u* , 
when every trajectory gives W = AF. 

Finally, while Eq. (fT"3]) is specifically a generalization of 
Eq. {1} , the approach we take is readily extended to other 
nonequilibrium identities for free ener gy differences, in- 
cluding Crooks's fluctuation theorem [281 ] and Hummer 
and Szabo's identity for potentials of mean force 
Moreover, it would be interestin g to combine our ap- 
proach with the large time step [29( and optimal pro- 
tocol [30( strategies, recently proposed for improving the 
efficiency of free energy estimates. 
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provided by the University of Maryland. 
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